🔬 Longest Common Subsequence (LCS)

Help Dr. Maya analyze protein sequences by finding the LCS and its occurrences

👩‍🔬 Dr. Maya's Protein Analysis

🎯 The Mission:

Dr. Maya needs to find the Longest Common Subsequence (LCS) between two protein sequences and count its occurrences in proteinA.

📋 Requirements:

  • Compute the LCS of two protein sequences
  • Count how many times the LCS appears in proteinA
  • Display the LCS string
  • Handle sequences of length 1 to 1000

Input/Output Specifications

  • Input: Two strings, proteinA and proteinB (1 ≤ length ≤ 1000)
  • Output: Length of LCS, number of occurrences in proteinA, and the LCS string

Example: proteinA = ABCDGH, proteinB = AEDFHR

proteinA:

A
B
C
D
G
H

proteinB:

A
E
D
F
H
R

LCS: ADH (Length: 3, Occurs: 1 in proteinA)

⚡ LCS Explained

How LCS Works

  1. Dynamic Programming: Build a DP table to find the length of LCS
  2. Backtrack: Trace back through the DP table to construct the LCS
  3. Count Occurrences: Count how many times the LCS appears as a subsequence in proteinA

DP Table Example (ABCDGH, AEDFHR)

AEDFHR
0000000
A0111111
B0111111
C0111111
D0112222
G0112222
H0112233

LCS Length: 3, Sequence: ADH

Time Complexity

O(mn)

For DP table (m, n lengths)

Space Complexity

O(mn)

For DP table

Counting LCS

O(2^m)

Recursive counting

Why LCS?

  • ✅ Identifies common patterns in sequences
  • ✅ Useful in bioinformatics and text comparison
  • ✅ Dynamic programming ensures efficiency
  • ❌ Counting occurrences can be slow for large inputs

🔍 Step-by-Step LCS Demo

Click "Start Demo" to begin step-by-step visualization

Algorithm Progress:

1. Display input sequences
2. Build DP table
3. Backtrack for LCS
4. Count occurrences
5. Display results

Current Sequences:

proteinA:

proteinB:

DP Table:

🎮 Practice LCS

Enter sequences and click "Find LCS"

Test Cases

Example 1: proteinA = ABCDGH, proteinB = AEDFHR → Length: 3, Count: 1, LCS: ADH

Example 2: proteinA = AAAAAA, proteinB = AAXAAA → Length: 5, Count: 6, LCS: AAAAA

📊 Algorithm Analysis

LCS Process

  1. DP Table: Build a table to compute LCS length
  2. Backtrack: Reconstruct the LCS string
  3. Count: Recursively count LCS occurrences in proteinA

Time Complexity

O(mn)

For building DP table

Space Complexity

O(mn)

For DP table

Counting Time

O(2^m)

For recursive counting

Key Points

  • Dynamic Programming: Efficiently computes LCS length
  • Backtracking: Reconstructs the LCS string
  • Recursive Counting: Counts all subsequence occurrences
  • Applications: Bioinformatics, text comparison
  • Limitation: Counting can be exponential for large inputs